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Abstract 

A new Monte-Carlo method for solving linear parabolic partial differential equa- 
tions is presented. Since, in this new scheme, the particles are followed backward 
in time, it provides great flexibility in choosing critical points in phase-space at 
which to concentrate the launching of particles and thereby minimizing the statis- 
tical noise of the sought solution. The trajectory of a particle, X$(t), is given by the 
numerical solution to the stochastic differential equation naturally associated with 
the parabolic equation. The weight of a particle is given by the initial condition of 
the parabolic equation at the point Xj(0). Another unique advantage of this new 
Monte-Carlo method is that it produces a smooth solution, i.e. without 5-functions, 
by summing up the weights according to the Feynman-Kac formula. 
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1 Introduction 



The Monte-Carlo method was conceived at the Los Alamos National Labo- 
ratory during the Manhattan Project as a numerical method for solving the 
Boltzmann equation governing the neutron distribution function in fissile ma- 
terial [1]. Since then it has found numerous other uses across many fields of 
science. Overviews of its areas of applicability can be found in any of a num- 
ber of textbooks, see e.g. Ref. [2]. In plasma physics the Monte-Carlo method 
has a long history of being used for solving the Fokker-Planck equation; see 
Ref. [3] for some of the earliest examples. It has two main advantages: it 
keeps small the incremental effort of solving a higher-dimensional problem, 



E-mail: carlssonja@ornl.gov 



Preprint submitted to Elsevier Preprint 



1 February 2008 



and it makes easy satisfying boundary conditions. The Monte-Carlo method 
is typically worth consideration for three- or higher-dimensional problems, 
or already in two dimensions if there are internal boundary conditions im- 
posed. The biggest disadvantage is the unavoidable statistical noise caused by 
the use of random numbers. This noise scales as the inverse square root of 
the number of particles followed (and hence the number of arithmetic opera- 
tions and memory accesses required). The poor scaling is somewhat offset by 
the fact that the Monte-Carlo method "can be applied by many computers 
working in parallel and independently" as Metropolis and Ulam pointed out 
more than half a century ago [1]. In many cases, e.g. when a tail distribution 
forms, low-density regions of phase space are of particular interest. An exam- 
ple of such from plasma physics is when high-power radio waves are 
launched into the plasma and absorbed through resonance with the gyration 
of ions around magnetic field lines, resulting in the formation of a tail of high- 
energy ions. The Monte-Carlo method has been used on numerous occasions to 
solve the quasilinear Fokker-Planck equation, which models such wave absorp- 
tion [4]. Due to the inverse-square-root scaling, the relative statistical error 
is worst in exactly these interesting low-density regions. Increasingly sophisti- 
cated weighting (i.e. splitting of particles), and reweighting, schemes [5] have 
been suggested to make the relative statistical error more constant throughout 
phase space. 

The ^/-method [6] also exists in a collisional version [7-9]. There seem to be 
two different schools of thought on how to include the collisions: by making the 
particle trajectories stochastic [7] or by letting the collisions enter the weight 
equation [8,9]. With the latter approach the spreading of the weight causes a 
gradual increase of the statistical error for a fixed number of particles. This 
makes simulations problematic on timescales longer than a few collision times. 
A potential cure for this particular ailment has recently been suggested [9]. 
Since the ^/-method assumes that the solution is a weak perturbation of the 
equilibrium solution everywhere in phase space, it cannot be used when the 
distribution function develops a tail. 

The new Monte-Carlo method presented here is firmly based on the well- 
established Feynman-Kac formula, which is briefly introduced in section 2. 
The Feynman-Kac formula puts the solution of a parabolic partial differential 
equation on the form of a conditional expectation value of a function of a 
stochastic variable. In section 3 it is shown how the numerical evaluation of this 
expectation value takes the form of a Monte-Carlo method stepping backward 
in time. Going backward in time allows us to choose the exact points in phase 
space, e.g. on an equidistant grid, at which we calculate the solution. It also 
allows us to redistribute the statistical noise to regions of phase space where it 
does the least harm. Finally, a comparison with the traditional Monte-Carlo 
method can be found in section 4. 
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2 The Feynman-Kac formula 



There is an intimate connection between linear parabolic partial differential 
equations (PDEs) and stochastic differential equations (SDEs). Let us illus- 
trate by using the regular diffusion equation as an example: 

with some initial condition f(x, 0) = $(x). Now, let the process X(t) be 
governed by the SDE, 

dX = fidr + adW , (2) 

where W is a Wiener process, r = T — t, and we impose the initial condition 
X(0) = x. Using Ito's formula [see Eq. (A. 7) in appendix A] to differentiate 
f( x ( r ), T ), w e get: 

/(X(0),0) = f(X {n T) + /; {% + 4§} d r + /; a | M> 

(3) 

(To make this article more self-contained a brief introduction to stochastic 
calculus and a mathematically somewhat questionable, but hopefully eluci- 
dating, derivation of Ito's formula is presented in appendix A for those who 
are unfamiliar with the formalism). By defining, 

dD , 

^=^> ^ = v / 2D, (4) 

we make Eq. (2) the naturally associated SDE of the linear parabolic PDE (I). 
Using the definition (4), Eq. (1), and the initial conditions f(x, t = 0) = 
and X(t = 0) = x, Eq. (3) becomes: 

r° df 

f(x,t=T) = *(X(t = 0)) + J t °Yx dW ■ 

Taking the expectation value of both sides we obtain: 

f(x,t=T) = E[$(X(t = 0)) | X(t = T) = x] , (5) 

which is known as the stochastic Feynman-Kac representation of f(x,T), or 
the Feynman-Kac formula for short. (See Ref. [10] for a thorough presentation 
of stochastic calculus and the Feynman-Kac formula). 
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3 Backward Monte-Carlo method 



In general, the expectation value on the RHS of Eq. (5) must be calculated 
numerically. We first integrate the SDE (2): 



/T + At r-T + At l-T + At 

dX = j /j,dr + J adW =>• 



X(t + At) - X{t)=h[1 + 0{VAt)}At + 

a[l + 0(Va1)][W(t + At) - W{t)\ 



The 0{y At) error terms come from the variation of \i and a during the time 
step At. By the definition of a Wiener process (see appendix A), 
W(t + At) - W(t) e N(0, VA1) and we can write: 



X(r + At) = X(t) + //At + (aVAt + O(At) , 

where ( is a zero-mean, unit- variance Gaussian random number, ( € N(0, 1). 
The numerical approximation of the Feynman-Kac formula (5) is simply: 

N 

f(z,t=T) = N- 1 ]T ^Xi(t = 0)) + O(At) + OtN- 1 ' 2 ) , (6) 
i=i 

where 0(N~ 1 / 2 ) is the statistical error and the stochastic variables Xj(t = 0) 
are found by following the stochastic trajectories given by: 

Xi(t - At) = Xi(t) + fiAt + (aVAt , Xi(t = T) = x , i = 1, . . . , iV . (7) 



A very simple algorithm, illustrated in Fig. 1, can now be used to solve 
Eq. (1). Assume that we want the solution at time t — T at the points 
x — Xj , j — 1, . . . , n. Then, for each j, we simply launch particles at 
x = Xj and let them evolve according to the backward Monte-Carlo equation 
of motion (7). As they reach t = 0, we sample <&(#) at their respective loca- 
tions, Xj(t = 0), and calculate the solution f(x = Xj,t = T) as the average of 
the sampled values, $(Xj(t = 0)), just as prescribed by Eq. (6). 



Notice that the relative statistical error should be roughly constant if is the 
same for every j. Alternatively, as is indicated in Fig. 1, we can concentrate 
the launching of particles to exactly those points where a low-noise solution 
is desirable. In this sense, the backward Monte-Carlo method offers a perfect 
weighting scheme. 
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Fig. 1. The backward Monte-Carlo method. The trajectories as given by the back- 
ward Monte-Carlo difference equation of motion (7). 

The equivalents of Eqs. (6) and (7) when Eq. (1) is replaced by a more general 
linear parabolic PDE can be found in appendix B. 



4 Discussion 

The algorithm we arrived at in the previous section has a striking similarity to 
the conventional Monte-Carlo method, but there are also some fundamental 
differences. When comparing the conventional Monte-Carlo difference equa- 
tion of motion: 



to Eq. (7), it is evident that they both describe identical trajectories, but for 
Eq. (7) these trajectories are traversed backward in time. Since we are deal- 
ing with parabolic equations, moving backward in time raises questions about 
time reversibility and the change of entropy. The form of the solution (6) also 
raises some suspicion; / on the left-hand side is a macroscopic quantity and 
so is $ on the right-hand side, whereas Xi(t = 0) is microscopic. In itself the 
backward Monte-Carlo difference equation of motion (7) is perfectly legiti- 
mate; at the microscopic level the motion is time-reversible since the entropy 
is undefined. The potential danger lies in macroscopic information spilling 
over into the microscopic world; i.e. if the particles carried any information 
about the solution with them going backward in time, then clearly the second 
law of thermodynamics would be violated. Fortunately, the form of Eq. (6) 
guarantees that this will not happen since the particle weight $(Xj(t = 0)) is 
undefined until t — 0. Despite the superficial similarity between the Monte- 
Carlo difference equations of motion, (7) and (8), this is quite different from 



Yi(t + At) = Yi(t) + fiAt + (aVAl , i = l, ... ,N , 



(8) 
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a conventional, forward Monte-Carlo method where the particle weights are 
known at all times, t > 0. 



Another difference between the backward and the forward Monte-Carlo method 
is the very different character of the solutions. With the forward Monte-Carlo 
method an obvious weighting scheme would be to use the same weights as 
in the backward method and launch the particles with Yi(t = 0) uniformly 
distributed over some sub-interval of y (see Fig. 2). The solution then looks 
like 

N 

f(y,t=T) = N- 1 ]T *(*i(f = 0)) 6{y - Y t (t = T)) . 
i=i 



Just to do something as simple as plotting the solution, the particles have to be 
distributed in bins to smooth out the jaggedness and the solution interpolated. 
This obviously means that a trade-off between resolution and noisiness is un- 
avoidable. With the backward method, the solution [see Eq. (6)] is given as a 
numerical value at a point in phase space; it does not contain any 5-functions. 
Points can be arbitrarily close together to give the desired resolution without 
increasing the noise (see Fig. 1). It is also worth noting that the backward 
method makes it trivial to calculate the solutions for a whole set of initial 
conditions, f(x,t = 0) = $ m (x), once the trajectories have been traced back 
to t — and the X^t = 0) are known. This makes the summing of sam- 
ples $(Xj(t = 0)) in Eq. (6) somewhat similar to the convolution of a Green 
function with 




Fig. 2. Conventional, forward Monte-Carlo method with the same weights as the 
backward method. The trajectories as given by the forward Monte-Carlo difference 
equation of motion (8). 
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In the forward Monte-Carlo method, the drift [i and the diffusion function a 
are derived by taking moments of the single-particle distribution function [11], 
and the forward stochastic Monte-Carlo difference equation of motion (8) is 
normally seen as something rather artificial. From section 3 we see that in 
the backward Monte-Carlo method, the Monte-Carlo difference equation of 
motion (7) has a more natural interpretation as the numerical solution to the 
SDE (2) naturally associated with the parabolic equation (1) that we wish to 
solve. 

Finally, since it has a tendency to appear in contexts similar to this, a very 
brief comment on the Langevin equation, and why its use is deprecated, is 
made in appendix C. 



5 Summary 



A backward Monte-Carlo method for solving parabolic differential equations 
has been introduced. As compared to the conventional, forward Monte-Carlo 
method, which is derived by taking moments of the single-particle distribution 
function, the improved method originates from quite a different starting point: 
the Feynman-Kac formula. 

The stochastic Monte-Carlo difference equations of motion, including the drift 
/i and the diffusion function a, are identical in the conventional and the new 
scheme, except for one vital difference: in the new scheme the particles are 
followed backward in time. The similarity should make it easy to retrofit the 
backward method to existing Monte-Carlo codes. 



The solutions found with the forward Monte-Carlo method and the backward 
one, however, take completely different forms. In the backward scheme, the 
solution is smooth, unlike the jagged sum of (^-functions associated with the 
forward Monte-Carlo method. By default, the backward method also yields 
a solution with a roughly constant relative statistical error throughout phase 
space. In addition, it offers great flexibility in redistributing the statistical 
noise to corners of phase space where it does minimal harm. This latter capa- 
bility makes the backward method particularly well suited for cases where we 
are only interested in the solution in a small part of phase space. 
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A Stochastic calculus and Ito's formula 

We start with the stochastic differential equation (2), repeated here for con- 
venience: 

dX = fidr + adW , (A.l) 

where W is a Wiener process. Ordinary stochastic variables are just mappings 
from one probability space to another; stochastic processes are time dependent. 
A stochastic process W(t), t > is a Wiener processes iff: 

• W(0) = 

• the increment W{r + At) — W(t) , At > , is independent of W(s) , s < r 

• W(t + At) - W(t) e N(0, VAT) 

• W{t) has continuous trajectories 

We will first derive a differential identity that will be needed later in this 
appendix. We first define At = r J+ i — tj and AWj = W(tj + i) — Wijj) with 
Tj — j r/n , j = 0, 1, . . . — We are now ready to introduce the stochastic 
variable S n (r), 

n-l 

Sn{r) = E( A ^) 2 ■ (A.2) 

3=0 

If dW/dT had existed, then clearly S n (r) would tend to zero as n goes to 
infinity. But the derivative dW/dT is undefined everywhere, so we have to 
actually calculate the limit value. We take a congenially probabilistic approach 
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to this task. The expectation value E[S n ] is trivial: 



n— 1 n—1 n—1 n—1 

E[S n ] = £ E^AWj) 2 } = £ V[AW 3 ] = £ Arj = J>/n) = r . (A.3) 

J'=0 j=0 i=0 i=0 



To establish that the expectation value (A.3) is really the sought limit of (A. 2), 
we must show that the variance goes to zero when n goes to infinity. 

We will start by calculating e\x 4 ],X e N(0,a), 

(A.4) 



E 



X 4 



(2trx 



2\-l/2 



x e 2<t- 



' dx 



= (2na 2 y 1/2 r 3a 2 x 2 e-^dx = 3a 2 V[X] = 3V[X] 2 , 

J — oo 

where (• • • ) is an integration by parts. With the help of the identity (A.4) we 
find 



n-1 



n-1 



E 



(AW- 



V[S n ]=Y,v[(AW J ) 2 ]=Y,(E[(AW J ] 

3=0 j=0 V 

n—1 n—l n—l 

=2 £ V[{AW 3 ] 2 = 2 £(Ar,) 2 = 2^(r/n) 2 = 2r 2 /n . 

j=0 j=0 j=0 



(A.5) 



Now, since V^] — > , n — > oo and £'[S' n ] — > r , n — > oo, we will be bold 
enough to draw the conclusion (inspired by the limit sum): 

n—l 

Jim S n (r) = hm g(AW,) 2 = r =► ^(^W) 2 = jT rfr . (A.6) 

Now we are ready to calculate the differential df(X(r), r) and start by Taylor 
expanding / to second order: 

df = f x dX + f T dr + \f xx {dX) 2 + \f TT {dr) 2 + f XT dX dr . 

Substituting Eq. (A.l) for dX and letting the identity (A.6) justify the order- 
ing dW > dr = (dW) 2 > dWdr > (dr) 2 , we get 



df = {f T + fifx + \<y 2 fx X ) dr + af x dW , 



(A.7) 



where only the two lowest orders (dW and dr) have been kept. This is the 
sought Ito's formula, which is more rigorously derived in Ref. [10]. 
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B General linear parabolic PDE 



The backward Monte-Carlo method introduced in section 3 can be used to 
solve much more general linear parabolic PDEs than Eq. (1). In this appendix 
we will generalize Eqs. (6) and (7) to solve the following equation: 

Bf 8 Bf 

with the initial condition f(x k , 0) = §(x k ), and D M = D M (x k , t), A = \(x k , t), 
and S = S(x k ,t). We will again try to find the Feynman-Kac representation 
of f(x k ,t), and to do so we need the naturally associated SDEs. The matrix 
D kl is in general not diagonal. In other words, the diffusion processes along 
the different axes are in general correlated to some degree and the SDEs take 
the form: 

dX k = fi k dr + A kl dW l . 

Ito's formula is trivial to generalize to multiple dimensions, and applying it to 
/ we find the identities: 

k _ dD^_ 

and 

A km A im = 2D k£ _ ( R2 ) 

Finding the Feynman-Kac representation of the solution to Eq. (B.l) is straight- 
forward, with A(t) = exp[/ * \(X k (s), s) ds] we get 



f{x\T)=E 



A(T)$(X fc (0)) + ( T A(t)S(X k (t),t)dt X k (T) 
Jo 



x k 



with the numerical approximation 

f(x k ,T) = N~ 1 f:i [ A(T)<S>(X k (0)) + / o T A(t)5(Xf(t),t)rft| , (B.3) 
,=i 

where 

X k (t - At) = X k {t) + //At + ( e A ke \fAl , X k (T) =x k , i — ... ,N . 

(B.4) 

Here, ( e are uncorrelated, zero-mean, unit-variance Gaussian random num- 
bers, ( e G N(0, 1), and the matrix elements A solve the system of algebraic 
Eqs. (B.2). 
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C Langevin equation 



A literature review on Monte-Carlo methods for solving parabolic equations is 
impossible without occasionally coming across the Langevin equation [12,13]: 

^ = -/3v + A(t). (C.l) 

Here v is the velocity of a particle, and A(t) is a "fluctuating" acceleration. 
The Langevin equation was historically used to model Brownian motion [14]. 

The following assumptions are being made about the "fluctuating" term A(i): 

• A(t) is independent of v. 

• A(t) varies extremely rapidly compared to the variations of v. 

It should come as no surprise that the second assumption is problematic. To 
quote Chandrasekhar [13]: "But we should draw attention even at this stage 
to the very drastic nature of assumptions implicit in the very writing of an 
equation of the form (C.l). For we have in reality supposed that we can divide 
the phenomenon into two parts, one in which the discontinuity of the events 
taking place is essential while in the other it is trivial and can be ignored" . 

Since the theory of stochastic calculus [10] is on considerably firmer math- 
ematical footing, the SDE (2) should be allowed to supersede the Langevin 
equation. 
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